Skip to content

fix(docker): honor configured supervisor image#1935

Merged
elezar merged 1 commit into
mainfrom
fix-local-e2e
Jun 25, 2026
Merged

fix(docker): honor configured supervisor image#1935
elezar merged 1 commit into
mainfrom
fix-local-e2e

Conversation

@elezar

@elezar elezar commented Jun 16, 2026

Copy link
Copy Markdown
Member

Summary

Fix the local Docker e2e path and Docker driver supervisor resolution so a configured supervisor_image is honored before local or sibling supervisor binaries. The Docker e2e wrapper now builds or pulls a supervisor image and writes supervisor_image into gateway TOML instead of generating a temporary supervisor_bin override.

Related Issue

N/A

Changes

  • Prefer explicit supervisor_bin, then configured supervisor_image, then sibling/local target binaries, then the default release-matched supervisor image.
  • Add a Docker driver unit test proving configured supervisor_image wins over local supervisor binaries.
  • Simplify e2e/with-docker-gateway.sh to use the configured supervisor image path and build openshell/supervisor:dev when no image override is provided.
  • Update Docker driver and gateway config docs to describe supervisor image extraction behavior.

Testing

  • mise run pre-commit passes
  • Unit tests added/updated
  • E2E tests added/updated

Not rerun while updating this PR description.

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Documentation updated

@elezar elezar requested review from a team, derekwaynecarr and mrunalp as code owners June 16, 2026 19:15
@github-actions

Copy link
Copy Markdown

@elezar elezar added test:e2e Requires end-to-end coverage test:e2e-gpu Requires GPU end-to-end coverage labels Jun 16, 2026
@github-actions

Copy link
Copy Markdown

Label test:e2e applied for 5b97b26. Open the existing run and click Re-run all jobs to execute with the label set. The run will execute the standard E2E suite after building the required gateway and supervisor images once. The matching required CI gate status on this PR will flip green automatically once the run finishes.

@github-actions

Copy link
Copy Markdown

Label test:e2e-gpu applied for 5b97b26. Open the existing run and click Re-run all jobs to execute with the label set. The run will execute GPU E2E after building the required supervisor image once. The matching required CI gate status on this PR will flip green automatically once the run finishes.

@elezar

elezar commented Jun 16, 2026

Copy link
Copy Markdown
Member Author

gator-agent

PR Review Status

Validation: this is maintainer-authored, project-valid Docker e2e and SSH environment stabilization work. It keeps local Docker e2e aligned with the configured supervisor image and prevents host linker variables from breaking the external OpenSSH process while preserving the proxy child environment.
Head SHA: 5b97b26a7af0618d2b86569e01a9a94fcccd6406

Review findings:

  • No blocking findings remain.
  • Nonblocking suggestion from the independent review: the internal DockerComputeConfig.supervisor_image comment is stale because configured supervisor_image now takes precedence over sibling/local builds when supervisor_bin is unset.

Docs: relevant Docker supervisor image behavior is reflected in the Docker driver README and docs/reference/gateway-config.mdx; no docs navigation change appears necessary.
E2E: test:e2e and test:e2e-gpu are already applied because this affects Docker e2e/supervisor behavior.
Checks: OpenShell / Branch Checks, OpenShell / Helm Lint, and DCO are passing. OpenShell / E2E and OpenShell / GPU E2E are pending for this head.

Next state: gator:watch-pipeline

@elezar elezar added the gator:watch-pipeline Gator is monitoring PR CI/CD status label Jun 16, 2026
@elezar

elezar commented Jun 16, 2026

Copy link
Copy Markdown
Member Author

gator-agent

Re-check After CI Update

I re-evaluated latest head 5b97b26a7af0618d2b86569e01a9a94fcccd6406 after the required E2E gates completed.

Disposition: not resolved.

Remaining items:

  • Still blocking: OpenShell / E2E failed. The e2e / E2E (rust-docker) and e2e / E2E (rust-podman-rootless) jobs both failed tests/forward_proxy_graphql_l7.rs::graphql_l7_enforces_allow_and_deny_rules_on_forward_and_connect_paths.
  • Docker observed forward_query_allowed = 403 where the test expected 200.
  • Rootless Podman observed forward_get_query_allowed = 403 where the test expected 200.
  • OpenShell / Branch Checks, OpenShell / Helm Lint, DCO, and OpenShell / GPU E2E are green for this head.

Next state: gator:in-review

@elezar elezar added gator:in-review Gator is reviewing or awaiting PR review feedback and removed gator:watch-pipeline Gator is monitoring PR CI/CD status labels Jun 16, 2026
@elezar elezar requested a review from maxamillion as a code owner June 17, 2026 07:05
@elezar elezar enabled auto-merge (squash) June 17, 2026 12:19
pimlock
pimlock previously approved these changes Jun 17, 2026
Comment thread crates/openshell-driver-docker/src/lib.rs
Comment thread crates/openshell-cli/src/ssh.rs Outdated
@elezar elezar added gator:blocked Gator is blocked by process or repository gates and removed gator:in-review Gator is reviewing or awaiting PR review feedback labels Jun 23, 2026
@elezar

elezar commented Jun 23, 2026

Copy link
Copy Markdown
Member Author

gator-agent

Blocked

Gator is blocked by merge conflicts with the base branch. GitHub reports mergeable_state=dirty for head 242ace258f3cb2eda8dd8d4150db6ba600da10e7.

Next action: @elezar, please rebase or merge the base branch and resolve the conflicts, then push an updated head so gator can re-check review state and CI.

@elezar

elezar commented Jun 23, 2026

Copy link
Copy Markdown
Member Author

gator-agent

Re-check After Author Update

I re-evaluated latest head af7a19c85f5fdfb770f4fe851e3fd5a3f8bae5e2 after the branch update since the 2026-06-23 10:10 UTC merge-conflict blocker.

Disposition: partially resolved.

Remaining items:

  • Resolved: GitHub now reports mergeable=true and mergeable_state=blocked, so the previous dirty merge-conflict blocker is cleared.
  • Still blocking: OpenShell / Branch Checks failed for this head. Both Rust lint jobs fail on crates/openshell-driver-docker/src/lib.rs:2979 with clippy::collapsible_if.
  • OpenShell / E2E and OpenShell / GPU E2E are still pending for this head; gator will re-check after the lint failure is fixed and CI completes.

Next state: gator:in-review

@elezar elezar added gator:in-review Gator is reviewing or awaiting PR review feedback and removed gator:blocked Gator is blocked by process or repository gates labels Jun 23, 2026
@elezar

elezar commented Jun 23, 2026

Copy link
Copy Markdown
Member Author

gator-agent

Maintainer Approval Needed

Gator validation and PR monitoring are complete.

Validation: maintainer-authored, project-valid local Docker e2e and SSH environment stabilization work.
Review: no blocking review findings remain. The earlier CI blocker on head af7a19c85f5fdfb770f4fe851e3fd5a3f8bae5e2 is resolved.
Docs: relevant Docker supervisor image behavior is covered in the Docker driver README and docs/reference/gateway-config.mdx; no docs navigation change appears necessary.
Checks: OpenShell / Branch Checks, OpenShell / Helm Lint, DCO, and required CI gate statuses are passing for head af7a19c85f5fdfb770f4fe851e3fd5a3f8bae5e2.
E2E: test:e2e and test:e2e-gpu are applied; OpenShell / E2E and OpenShell / GPU E2E are passing.

Human maintainer approval or merge decision is now required.

@elezar elezar removed the gator:in-review Gator is reviewing or awaiting PR review feedback label Jun 23, 2026
@elezar elezar added the gator:approval-needed Gator completed review; maintainer approval needed label Jun 23, 2026
@elezar elezar added gator:in-review Gator is reviewing or awaiting PR review feedback and removed gator:approval-needed Gator completed review; maintainer approval needed labels Jun 23, 2026
@elezar

elezar commented Jun 23, 2026

Copy link
Copy Markdown
Member Author

gator-agent

Re-check After Author Update

I re-evaluated latest head 3ed9a9796ac83be84c772e8e071da15d41a3b3c5 after the head changed from the prior approval handoff, and ran a bounded independent re-review of the current diff.

Disposition: partially resolved, but not ready to leave review.

Remaining items:

  • Still pending: OpenShell / Branch Checks, OpenShell / E2E, and OpenShell / GPU E2E are queued for this head, so the prior approval handoff for af7a19c85f5fdfb770f4fe851e3fd5a3f8bae5e2 is stale.
  • Review warning: e2e/with-docker-gateway.sh now builds the default local openshell/supervisor:dev image without pinning the build platform to the Docker daemon architecture. If an ambient DOCKER_PLATFORM is set to a different platform, local Docker e2e can build/extract the wrong-architecture supervisor and fail later with an exec-format error. Please derive the daemon architecture and pass DOCKER_PLATFORM=linux/<daemon-arch> into the local supervisor image build, or explain why the inherited platform cannot be mismatched in this path.
  • Review warning: the new crates/openshell-cli/src/ssh.rs tests that call temp_env::with_vars should take TEST_ENV_LOCK, matching the module's existing process-environment tests, to avoid parallel test flakes.

Docs: relevant Docker supervisor image behavior remains covered in the Docker driver README and docs/reference/gateway-config.mdx; no docs navigation change appears necessary.

Next state: gator:in-review

@elezar

elezar commented Jun 23, 2026

Copy link
Copy Markdown
Member Author

gator-agent

Re-check After CI Update

I re-evaluated latest head 3ed9a9796ac83be84c772e8e071da15d41a3b3c5 after the required gates completed.

Disposition: partially resolved, but not ready to leave review.

Remaining items:

  • Resolved: OpenShell / Branch Checks, OpenShell / Helm Lint, DCO, OpenShell / E2E, and OpenShell / GPU E2E are now passing for this head.
  • Still unresolved: the review warning in e2e/with-docker-gateway.sh about pinning the local supervisor image build platform to the Docker daemon architecture, or explaining why inherited DOCKER_PLATFORM cannot be mismatched in this path.
  • Still unresolved: the crates/openshell-cli/src/ssh.rs tests that call temp_env::with_vars should take TEST_ENV_LOCK, matching the module's existing process-environment tests, to avoid parallel test flakes.

Next state: gator:in-review

Signed-off-by: Evan Lezar <elezar@nvidia.com>
@elezar

elezar commented Jun 25, 2026

Copy link
Copy Markdown
Member Author

gator-agent

Re-check After Author Update

I re-evaluated latest head c7e03203cb151a73e1f211976878b1d59471e4dc after the head changed from the prior gator:in-review feedback, and ran a bounded independent re-review of the current diff.

Disposition: resolved.

Remaining items:

  • Resolved: the previous local supervisor image architecture concern is addressed by building with DOCKER_PLATFORM=linux/<docker-daemon-arch>.
  • Resolved: the Docker supervisor image precedence comment and test coverage are present.
  • No blocking gator items remain.
  • Nonblocking suggestion: the local openshell/supervisor:dev build would be more hermetic if it cleared ambient IMAGE_REGISTRY, DOCKER_OUTPUT, and DOCKER_PUSH when invoking tasks/scripts/docker-build-image.sh.

Docs: relevant Docker supervisor image behavior is covered in the Docker driver README and docs/reference/gateway-config.mdx; no docs navigation change appears necessary.
Checks: OpenShell / Branch Checks, OpenShell / Helm Lint, DCO, OpenShell / E2E, and OpenShell / GPU E2E are passing for this head.
E2E: test:e2e and test:e2e-gpu are applied and passing.

Next state: gator:approval-needed

@elezar elezar added gator:approval-needed Gator completed review; maintainer approval needed and removed gator:in-review Gator is reviewing or awaiting PR review feedback labels Jun 25, 2026
@elezar

elezar commented Jun 25, 2026

Copy link
Copy Markdown
Member Author

gator-agent

Re-check After Author Update

I re-evaluated latest head c7e03203cb151a73e1f211976878b1d59471e4dc after the author's 2026-06-25 13:37 UTC follow-up noting that the ssh.rs changes were removed from this PR.

Disposition: resolved; no state change is needed.

Remaining items:

  • No blocking gator items remain.
  • The current changed-file set no longer includes crates/openshell-cli/src/ssh.rs, and the previous approval handoff remains valid for this head.

Checks: OpenShell / Branch Checks, OpenShell / Helm Lint, DCO, OpenShell / E2E, and OpenShell / GPU E2E are passing.

Next state: gator:approval-needed

@elezar elezar changed the title fix(e2e): stabilize local Docker smoke test fix(docker): honor configured supervisor image Jun 25, 2026
Comment thread crates/openshell-driver-docker/src/lib.rs
@elezar elezar merged commit d93293a into main Jun 25, 2026
49 of 50 checks passed
@elezar elezar deleted the fix-local-e2e branch June 25, 2026 14:51
@elezar

elezar commented Jun 25, 2026

Copy link
Copy Markdown
Member Author

gator-agent

Monitoring Complete

Monitoring is complete because this PR has merged.

Final status: the last gator state was gator:approval-needed; required Branch Checks, Helm Lint, DCO, E2E, and GPU E2E were passing for the merged head.

I removed the active gator:* label because there is nothing left for gator to monitor on this PR.

@elezar elezar removed the gator:approval-needed Gator completed review; maintainer approval needed label Jun 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

test:e2e Requires end-to-end coverage test:e2e-gpu Requires GPU end-to-end coverage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants